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Abstract 

Bayesian inference methods are applied within a Bayesian hierarchical modelling framework 
to the problems of joint state and parameter estimation, and of state forecasting. We explore 
and demonstrate the ideas in the context of a simple nonlinear marine biogeochemical model. 
A novel approach is proposed to the formulation of the stochastic process model, in which 
ecophysiological properties of plankton communities are represented by autoregressive stochastic 
processes. This approach captures the effects of changes in plankton communities over time, and 
it allows the incorporation of literature metadata on individual species into prior distributions 
for process model parameters. The approach is applied to a case study at Ocean Station Papa, 
using Particle Markov chain Monte Carlo computational techniques. The results suggest that, 
by drawing on objective prior information, it is possible to extract useful information about 
model state and a subset of parameters, and even to make useful long-term forecasts, based on 
sparse and noisy observations. 

Keywords: Bayesian Hierarchical Modelling, Data Model, Inference in Nonlinear Models, Predic- 
tion, Parameter (Prior) Model, Stochastic Process Model, Uncertainty 



1 Introduction 

The last century has seen major advances in the ecological and earth sciences, both in the 
development of theoretical understanding, encapsulated in mechanistic process models, and in the 
development of sophisticated statistical theories and models for the interpretation and analysis of 



observations. However, as Berliner (2003 ) has pointed out, until recently the development of process 
models and the statistical analysis of observations have occurred in parallel and somewhat at arms 
length. Over the last two decades, there has been increasing effort devoted to the integration of 
observations and process models, so that model-data comparison and data assimilation are now 
key research topics. 
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There are a number of drivers for this increased emphasis on the integration of models and 
observations. The scientific community increasingly insists on the use of more objective and quan- 



titative measures or metrics to evaluate model predictions against observations (e.g. Allen et al. 



2007). But ecological and earth system models are increasingly used for practical purposes, from 



short-term environmental forecasting to local issues of pollution, conservation and renewable re- 
sources, to global issues of climate change. Users of model outputs would like more accurate 
predictions and increasingly demand formal assessments of the uncertainty in model predictions, 
to inform decision-making and risk-management. 

Techniques for the integration of models and observations are intended to quantify model perfor- 
mance and allow intercomparison of alternative models, to improve performance or skill in model 
predictions, and to provide error estimates or confidence/credible intervals around those predic- 
tions. Errors enter into an integrated model-data system from at least three sources. First, there 
are errors in the process of making observations, which typically provide a distorted and/or frag- 
mented glimpse of the underlying reality. One consequence is that we do not know the exact state 
of the system when we initialise dynamic models. Second, process models make simplifying as- 
sumptions and approximations, so that model simulations cannot be expected to reproduce reality 
exactly. Many ecological and earth system models are dynamic models, predicting the evolution 
of system trajectories over time, and model errors are typically stochastic, leading to divergence 
of simulated trajectories over time. Finally, process models typically incorporate a number of 
parameters, assumed constant over time, whose values are uncertain. 

The term "data assimilation" has been used broadly to describe model-data integration (e.g. 

In practice, approaches and applications have tended to fall into 



Gregg 


2008 


Luo et al. 


2011) 



one of two categories. In the first, attention has focused on the estimation of uncertain parameters 
in deterministic process models (e.g. Matear, 1995). Parameters are often estimated by minimizing 
some kind of cost function based on model-data mismatches, typically a sum-of-squared errors. In 
some cases, the cost function is constructed and interpreted as a negative log- likelihood based on a 
formal error model but, in other cases, the cost function is ad hoc. The second class of applications 
typically involves short-term environmental forecasting or hindcasting, where errors are believed to 
be dominated by uncertainty about the true value of the system state. Sequential data assimilation 
techniques are used to update estimates of the state based on current or recent observations. In 
these approaches, there tends to be a strong emphasis on building realistic observation models, 
while the stochastic model error is often modelled as simple additive white noise and adjusted to 
achieve convergence of the assimilation procedure. Very sophisticated data assimilation schemes 
are now widely adopted and routinely used in weather and ocean forecasting. 

The last decade especially has seen increasing advocacy of Bayesian approaches to data as- 



similation (e.g. Link et al. 2001 Berliner 2003 Calder et al. 2003 Cressie et al. 2009 Zobitz 



et al. 2011). Bayesian methods typically yield posterior distributions for the inferred state and 



parameters, most often summarised using large samples from these distributions. These can be 
particularly useful in applied contexts, where users may be interested in the probability distribu- 
tion of performance measures derived from model predictions. A key attraction of the Bayesian 
approach is its ability to formally incorporate prior information about models and parameters. 
Given that the rationale for using mechanistic, process-based models is that they build on prior 
scientific knowledge about the structure and function of system components, it makes sense to use 
methods that allow this knowledge to be formally represented in model-data comparisons. It is 
of course possible to use the Bayesian formalism, while discounting or ignoring prior information, 
through uninformative priors or empirical Bayes methods. In these cases, Bayesian methods can 
generally be shown to be equivalent to classical methods (e.g. Ver Hoef 1996 [Cressie et al" 2009). 
Within the broader Bayesian tradition, Bayesian Hierarchical Modelling (BHM) offers a par- 
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ticularly attractive framework for the integration of mechanistic process models and observations. 
BHM provides a consistent, formal probabilistic framework combining error or uncertainty in model 



parameters, model state, model processes and observations (Wikle 



(Wiklc 


2003 


Berliner 


2003 


Cressie 



et al. 2009). This framework encourages the modeller to think carefully and systematically about 
the approximations and assumptions involved in process model formulation, about the observation 
process and the relationship between model state variables and observations, and about the rela- 
tionship between model parameters and independent prior knowledge. One can think of BHM not 
just as an integration of models and data, but as a deep integration of mechanistic and statistical 



modelling; Berliner (2003) describes this as "physical-statistical" modelling. 



The last decade has seen a rapid growth of Bayesian applications in ecology and the earth 



sciences, ranging from population dynamics and dispersal (e.g. 


Link et al. 


2001| 


Calder et al. 


2003| 
2007[ 


Wikle, 2003[ 


Glark and Bjornstaad[ |2004[ |Clark and Gelfand 20061 


Barber and Gelfand 


Hooten et al. 


2007) to plant ecology and terrestrial surface fluxes (e.g. 


Ogle et al. 


2004 



Baker et al., 2006; Sacks et al., 2006 Xu et al., 2006; Zobitz et al., 2007 2008) to ocean circulation 



and climate (e.g. Berliner et al. 2000 Berliner, 2003). Encouragingly, Bayesian approaches are 



now widely and successfully used for stock assessment and fisheries management ( Maunder , 2004 ) . 

In this paper, we focus on the application of Bayesian methods, specifically BHM, to aquatic 
biogeochemical (BGC)/ecological models. Model-data integration in this field has paralleled the 
broader trajectory outlined above. Earlier studies focused on the problem of parameter estimation 
in deterministic models (Matear, 1995). Over the last decade, and following developments in 
data assimilation into physical ocean circulation models, there has been considerable progress in 
implementing sequential data assimilation techniques for state estimation in 3-D biogeochemical 
models (Gregg, 2008). Examples of Bayesian approaches in this area fall into two streams. The first 



uses a Bayesian approach to obtain posteriors for parameters and state estimation in (effectively) 



deterministic eutrophication models (Arhonditsis et al. 2008, 2007 Zhang and Arhonditsis, 2009). 



The second, in contrast, uses sequential Bayesian assimilation to obtain posteriors for current and 
forecast state in stochastic models in which the underlying parameters are assumed constant and 



known (Dowd and Meyer, 2003 Dowd, 2006 2007). More recently, Dowd (2011) has extended this 



work to obtain joint posteriors for the state and a subset of parameters. These examples all embed 



the ecological dynamics physically within a 0-D box model setting, but Mattern et al. (2010) extend 
this to a 1-D setting. 

The study presented here aims to build on previous work by using the BHM probabilistic frame- 
work to underpin enhancements in several areas: 

1. The process models used here include stochastic errors in a way that accounts for key simpli- 
fying approximations made in replacing communities of species by a single biomass variable. 
These approximations are widely used in ecological and biogeochemical models, and the ap- 
proach seems likely to find broader application. 

2. Our approach also allows prior distributions for model parameters to be more directly and 
objectively related to prior information obtained from field and laboratory studies, and from 
in literature meta-data. This prior information makes a valuable contribution to state es- 
timation and forecasting in the application considered here, where observations are severely 
limited. 

3. The process model has been modified to include a diagnostic variable, Chlorophyll a, to 
support a simpler and more rigorous observation model. 

4. Bayesian inference in nonlinear problems is generally analytically intractable, and computa- 
tionally intensive simulation-based methods, such as Markov chain Monte Carlo, are used 
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to obtain large random samples from the posterior. Our study exploits new methods for 



Bayesian inference (Andrieu et al. 2010) to derive a joint posterior for parameters and state 



in nonlinear dynamical models. This allows us to simultaneously address problems of pa- 
rameter estimation, state estimation, short-term forecasting and long-term projections in a 
unified probabilistic framework. 



The remainder of this paper is organised as follows. In Section [2j we provide a brief introduction 
to BHM and its application to dynamical state-space models. Section [3] presents a reformulation 
of a conventional deterministic model as a stochastic process model within the BHM framework. 
Uncertainty in the parameters is captured through a collection of time- varying stochastic processes. 
In Section |4j we provide a case study of this generic model applied to a time series of observations 
at Ocean Station Papa. Bayesian inference procedures are used to extract information in the form 
of posteriors for state and parameters from a set of observations that are sparse and patchy in time, 
and include only a subset of state variables. Twin experiments are used to test the performance and 
consistency of the inference procedures, and to draw some preliminary conclusions about the effect 
of observation intensity on posteriors. Section [5] discusses the results obtained in the context of the 
enhancements listed above, and we make some observations about the strengths and weaknesses of 
this approach for marine biogeochemical modelling, and ecological modelling more broadly. This 
is followed by mathematical, statistical and computing appendices. 



2 General Methodology 

2.1 Bayesian Hierarchical Models (BHMs) 



The physical-statistical models described by Berliner (2003), formulated as BHMs, are models 
that explicitly represent three sources of uncertainty: 



1. Data model: Expresses uncertainty arising from observations subject to measurement error 
and bias. 

2. Process model: Expresses uncertainty arising from scientific (here, biophysical) processes that 
are not completely understood or they are approximated. 

3. Parameter (prior) model: Expresses uncertainty arising from parameters not known exactly. 

BHMs are probabilistic models, constructed from conditional probability distributions. The 
data are treated as conditional on the process and some parameters, and the process is treated as 
conditional on other parameters. Hence, the three components, data, processes and parameters can 
be thought of as hierarchical levels in a chain of conditional dependence, which we now formalise. 

Let the data (observations), process(es) and parameters be represented by the vectors Y, W 
and 6, respectively. In some models, the process has a continuous index in time or space; for 
the purpose of computations it is enough to consider W as a high-dimensional vector. The joint 
uncertainty is denoted [Y, W, 9], where the notation [A] represents "the probability distribution of 
A." It makes sense to partition the parameters into biophysical parameters and so-called statistical 
parameters arising from the observation process. Therefore, we write 6 = {#y> $w}- 

Applying the rules of conditional probability, we can factorise the joint probability distribution 

as: 

[Y,W,0] = [Y|W,0 Y ,0 w ][W,0 Y ,0w], (1) 
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where [A|B] denotes "the conditional probability of A given B'' 
component of ([I]), we find: 



Repeating this for the second 



[y,w,0] = [Y\w,e Y ,e w }[w\e Y ,e w }[e Yl e w ] 



(2) 



The components of Q may be simplified a little by noting that the biophysical parameters, 0w 5 
are not needed in the data model when we also condition on the process; similarly, the statistical 
parameters, 6 Y , are not needed in the second component when we also condition on the biophysical 
parameters. Hence, we obtain: 



[Y,W,0] = [Y|W,0 Y ][W|0 w ][0Y,0w] 



(3) 



We see that the three probability distributions on the right-hand side correspond to the BHM 
hierarchy of sources of uncertainty identified above, representing a data model, a (stochastic) bio- 
physical process model, and a parameter model, respectively. The parameter model is often referred 
to as the prior distribution. 

Of key interest is how one can make inferences about the unobserved process state W and 
the parameters 0, given the observations Y on the biogeo chemical process. Appealing to Bayes' 
Theorem (e.g., Cox and Hinkley, 1986, pp. 365-367), we may write: 



[W,6\Y] oc [Y|W,0 Y ][W|0 w ][0 Y ,0 w ], 



(4) 



where the constant of proportionality is a function of Y only and guarantees that the right-hand side 
of Q is a proper joint probability distribution. This so-called posterior distribution is proportional 
to the product of the three levels of the BHM (data model, process model, parameter model) that 
we have developed above. We return later in this section to the issue of making inferences based 
on Q. 

The use of the three levels of conditional probability models via Bayes' Theorem to learn from 
data is precisely the BHM framework we alluded to at the beginning of this section. Examples of 
its use have been growing in the last decade. It was introduced in a climate-modelling and climate- 



prediction context by Berliner et al. (2000), in an introductory geophysical context by Berliner 



(2003), and in an ecological context by Wikle (2003); see also the review by Cressie et al. (2009) 



2.2 A State Space Representation 

We are interested here in the application of BHM to dynamical systems, in which the state 
evolves as a function of time (discrete or continuous), and the data are collected by sampling 
(potentially irregularly and coarsely) in time, whilst the process evolves at a relatively fine time 
step. We write the time-evolving process W as (Wo, Wi, . . . , Wt) with corresponding observations 
(Yi, . . . , Yt) taken after the initial value of the process Wo- We use subscript t to index time, 
such that W t is coincident with Yt, for t = 1, . . . , T. A graphical depiction of the dependencies is 
shown in Figure [T] below. 



Wo - 


-> Wi -> 


... Wt _ 1 _ 


■+ w t - 


-> W t+ i -> 


... W T 




I 


1 


I 




1 




Yi 


Y t _i 


Y t 


Y t+ i 


Y T 



Figure 1: A graphical representation for the process W and observations Y 

We remark that, in practice, observations will be missing at some times, which the BHM framework 
can readily handle. 
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We henceforth assume that the forward evolution of the process W depends only on the current 
state; that is, W is a Markov-process model described by [Wt|Wt_ i, #w]> f° r t = 1> ■ ■ ■ ,T. This 
form of conditional independence implies that [W|#w] = 11^=1 [Wf|Wt_ \, 0\v]. Further, observa- 
tions at time t are assumed to be independent of observations at other times, conditional on the 
state W t . Thus, the data model has the form, [Y|W,0 Y ] = Ut=l [ Y *l W t> #y]. 



2.3 Statistical Inference 



The focus of our statistical inference is the calculation of the posterior distribution described 
by Equation Q, which is rarely amenable to analytic solutions. As a result, modern Bayesian 
inference has harnessed efficient algorithms deployed on contemporary computing architectures to 
simulate samples from the posterior distribution. Statistics calculated for these samples, such as 
means and quantiles, can be shown to converge to the appropriate quantities for the posterior 



distribution (Tierney 1994). 



Suppose for instance that we are interested in estimating some function g(-) of the state and 
parameters. We obtain a simulated sample {(W^,#^): £ = 1,...,L} from the posterior dis- 
tribution [W, 6\ Y], and we use the transformed sample {g(W^ , 0^) : I = 1, . . . , L} to calculate 
summary statistics. For example, we can estimate the mean as: 



E(g(W,0)\Y) = (l/I)£ fl (wM,#), 



so sampling from the posterior distribution over states and parameters is key to the success of 
Bayesian hierarchical modelling in this context. The computational approach adopted must also 
be able to cope with the nonlinear behavior of the process model, noting that the state transition 
density function is not available in closed form. 

Particle Markov chain Monte Carlo (PMCMC) was developed for exactly this situation, and so 
we have applied it in our case study. In particular, we use the particle marginal Metropolis-Hastings 



(PMMH) sampler (Andrieu et al. 2010), which we have previously applied successfully to a simple 
Lotka-Volterra type model (Jones et al. 2010). Details of PMMH are given in Appendix [C] 



3 Reformulating a Marine BGC Model as a BHM 

A general description of the BHM framework and its use for scientific inference was given in 
Section [2] We now show how these ideas can be applied in a marine BGC setting. 

3.1 The process model 

Recall from Section [2] that the biogeo chemical process model is at the second level of the BHM 
hierarchy. We present the model first in terms of a deterministic model, and then we derive a 
stochastic version of it. 



3.1.1 A deterministic biogeochemical process model 

One of the advantages of the BHM framework is that it allows us to build on existing scientific 
understanding, typically incorporated in deterministic process models. We can draw here on a 
long and rich history of (deterministic) marine BGC models that describe the cycling of nutrients 
(e.g. nitrogen) and/or carbon through living and nonliving organic and inorganic compartments, 
in simplified marine ecosystems. Open-ocean models typically deal only with pelagic planktonic 
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systems, while coastal models may deal with coupled pelagic-benthic systems. In this article, we 
deal with the simpler case of pelagic models. 

In the general case, the state variables in marine BGC models are expressed as component 
concentrations (mass per unit volume) as functions of space x and time t. These components 
are subject to physical transport (advection and mixing), as well as local biological and chemical 
reactions. If c(x,i) is a vector of state variables, we can write the general reaction-transport 
equation as: 

dc 

— = R(c,x,t) + T(c,x,t), (5) 

where R represents local biological and chemical reactions, and T is a transport operator; see 
Appendix A. 2 for the specific form of ^ used in the case study in section |4j In this paper, we 



consider the highly simplified physical setting of a mixed-layer one-box model, and for the moment 
we ignore the transport operator and focus on the local reactions R. This setting allows us to 
formulate a BHM most clearly. However, we do include a simple transport term to account for 



vertical mixing in the case study and this is presented in Appendix A. 2 

Pelagic planktonic ecosystems are complex systems that involve many species of phytoplankton 
and zooplankton, multiple (potentially limiting) nutrients, and dissolved and particulate organic 
matter pools comprised of complex mixtures. All models of these systems require simplifying 
approximations, and the level of detail varies across models and depends on the purpose of the 
model. Model detail and complexity have tended to increase over the last decade, as scientific 
understanding and computational power have increased. However, this in turn has led to concern 



about the identifiability of complex models with many uncertain parameters (Hood et al. 2006). 

We have chosen a relatively simple, classic NPZD model formulation, which represents the 
cycling of a limiting nutrient (nitrogen) through four compartments: dissolved inorganic nitrogen 
or DIN (N), phytoplankton nitrogen (P), zooplankton nitrogen (Z), and detrital nitrogen (D). We 
can write the equations for the local rate of change of the state variables as: 

^ = 9 ■ P ~ gr • Z, (6) 
^ = E z ■ gr ■ Z - m ■ Z, (7) 
~ = (1 - E z ) ■ Id • gr ■ Z + m ■ Z - r ■ D, (8) 

rlN 

— = -g ■ P + (1 - E Z ) ■ (1 - f D ) ■ gr ■ Z + r ■ D. (9) 
at 

Notice that 7g~ + ^f + ^ + ^r = which is a consequence of "mass balance" in the currency 
of nitrogen. In (j6J) - g is the phytoplankton specific growth rate (per day, or d _1 ), gr is the 
zooplankton specific grazing rate (mg P grazed per mg Z d _1 ), m is the zooplankton specific 
mortality rate (d _1 ), and r is the specific breakdown rate of detritus (d _1 ). A fraction Ez of 
zooplankton ingestion is converted to zooplankton growth and, of the remainder, a fraction fjj is 
allocated to detritus, with the rest released as dissolved inorganic nitrogen, N. The fractions, E z 
and fjj, are treated as constant, independent of ingestion rates. This is a common simplifying 



assumption in biogeochemical models (e.g., Wild- Allen et al. 2010). 



The process rates g, gr, m, and r are all functions of state variables and/or exogenous forcing 
variables, and hence they are functions of time. As we shall see below, a multiplicative temperature 
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correction Tc is applied to all rate processes; to define Tc, we use a so-called "Qio formulation" for 
dependence on temperature T: 



Tc = Qf- T ^ )/W . 



(10) 



Notice that T depends on time and, hence, so does Tc, where reference temperature, and 

Qio is a prescribed parameter. 

We use a flexible formulation for the dependence of zooplankton's grazing rate on phytoplankton 
concentration (zooplankton functional response): 



gr 



Tc-Iz- A^ 



(1 + A V ) ' 

where v is a given power; the relative availability of phytoplankton A is 



A 



Cl z -P 
Iz 



(11) 



(12) 



where A depends on time because P does. In (12), Iz is the maximum zooplankton ingestion rate 



( m g p per mg Zd -1 ); and Clz is the maximum clearance rate (volume in m 3 swept clear per mg Z 
d _1 ). Both are constant in the deterministic formulation. This is a standard rectangular hyperbola 
or Type-2 functional response (Holling, 1965) when v = 1, and a Type-3 sigmoid functional response 
when v > 1. 



We follow Steele ( 1976 ) and Steele and Henderson ( 1992 ) in adopting a time-dependent quadratic 



formulation for zooplankton mortality: 



m 



Tc-rriQ-Z, (13) 

where the constant quadratic mortality rate tuq has units of (mg Z m -3 ) -1 d _1 . The detrital 
remineralization rate is assumed to depend only on temperature (which is time dependent): 



Tc-r D . 



(14) 



where the constant parameter td prescribes the rate at the reference temperature and has units of 
d- 1 . 

Finally, the phytoplankton specific growth rate g depends on temperature T, available light 
or irradiance E (see Appendix A. 3 ) and dissolved inorganic nitrogen N. The submodel given 
below for g is somewhat more elaborate than the submodels used for the other rate processes. 
We shall see that it predicts changes in phytoplankton composition (nitrogenxarbon ratio and 
chlorophyll-axarbon ratio) as well as the phytoplankton specific growth rate, as phytoplankton 
adapt to changes in available light and nutrients. 

In the BHM framework, we are encouraged to pay careful attention to the relationship between 
process model variables and what we can observe. For example, the process model predicts phy- 
toplankton biomass P in the currency of mg N m -3 , but we typically measure phytoplankton as 
a pigment (mg Chla m~ 3 ). The submodel given in the following paragraphs allows us to relate 
these chlorophyll observations (Chla) more rigorously to the state variable P. Our formulation 



represents a variant on models proposed by Geider et al. (1998), and details of our derivation are 



given in Appendix A.l 



The phytoplankton specific growth rate g is expressed in terms of g max (in units of d ), a 
constant maximum specific growth rate at the reference temperature, T re f, a light-limitation term 
h,E, and a nutrient-limitation term, hj^. That is, 
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g = Tc- g max ■ h E ■ h N /{h E + h N ) . (15) 
The light limitation term is given by 

h E = l- exp(-a • X max ■ E/g max ) , (16) 

where a is the initial slope of the photosynthesis versus irradiance curve (mg C mg Chla -1 mol 
photon -1 m 2 ), and \ max is the maximum chlorophyll-axarbon ratio (mg Chla mg C _1 ). The 
parameter a = ach • Q 1S the product of the chlorophyll-specific absorption coefficient for phyto- 
plankton, ach (m 2 m g Chla" 1 ), and the maximum quantum yield for photosynthesis, Q (mg C 
mol photons^ 1 ). 

The nitrogen limitation term is given by 



h N = N/((g max .Tc/a N )+N), (17) 

where a at is the maximum specific affinity for nitrogen uptake (mg TV -1 m 3 d _1 ). 
The phytoplankton nitrogenxarbon ratio, x> predicted by the model is given by: 

X= ' (18) 

where ^ mm and ^ max are the minimum and maximum nitrogenxarbon ratios (mg N mg C -1 ). 

The model predicts the phytoplankton chlorophyll-axarbon ratio A, and this can be combined 
with the nitrogenxarbon ratio % to convert phytoplankton biomass P (mg N m -3 ) to a predicted 
Chla concentration as: 



Chla = P ■ (A mffl 7 x ma:c ) • h N ■ Tc/(R N ■ h E + h N ), (19) 



max 
max 



where Rn = ^ min /^ max . This growth model involves six parameters (^g max J a, \ max , a^, x 
Rn)- The parameters a, \ max and x max appear only in terms of the ratios a / A max , and A 
/ x max i but since ^ max is fixed based on the Redfield ratio, this does not result in redundant 
parameters in our inference procedure. 

While this completes the specification of the local reactions R given in ^ , in the simple one-box, 
mixed- layer (i.e., 0-D) model adopted here, we do need to allow for effects of physical exchanges 
between the mixed layer and the underlying water mass. These exchanges add additional source- 
sink terms to the right-hand sides of Equations ©-(p)]), and these are specified in Appendix A. 2 



3.1.2 From a Deterministic to a Stochastic BGC Process Model 

The BHM framework encourages us to formulate the state or process model in probabilistic 
or stochastic terms, in order to capture the effects of approximations and errors in the process 
representation. Note that a stochastic-model formulation is not equivalent to recognising prior 
uncertainty in the (constant) parameters in a deterministic model. A deterministic model effectively 
asserts that, given the initial state and the parameters, the future state can be predicted exactly 
at all future times. A stochastic model asserts that, given the model state and parameters at the 
current time, we can make statements only about the probability distribution of the state at future 
times. 

A deterministic model of the kind described in Section 13.11 can be converted to a stochastic 
model in a number of ways. The simplest approach is to introduce an additive error term on the 
right-hand side of equations , either as a continuous Wiener process for the differential equations 
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Q-Q, or as a Gaussian error term at each time step in the discretised version. We have not 
adopted that approach here; we have tried instead to introduce randomness into the process model 
in a way that better reflects the approximations we make in formulating such models, and that 
preserves mass balance. Specifically, we replace the constant ecophysiological parameters in the 
deterministic model with stochastic processes that change as the underlying plankton community 
composition changes. In the remainder of this section we provide motivation for, and a detailed 
explanation of, this approach. 

A key approximation made in formulating models like the one given in Section 3.1 involves bio- 
logical aggregation. Phytoplankton and zooplankton communities, which consist of many different 
species, are each represented in the model by a single compartment. More complicated models may 
divide phytoplankton or zooplankton biomass into two or more functional groups with different 
ecological roles, but each group still constitutes an aggregation of diverse species. The model for- 
mulations used in Section 3.1 are largely derived from many, many laboratory studies of individual 
species or isolated samples, which give us reason for confidence in the structural form of the models. 
However, these studies also show very large levels of variation in many of these eco-physiological 
parameters, across individual species, or across field samples. Hence, the properties assigned to 
functional groups in these models must be thought of as representing some kind of average across 
the community of species making up the functional group. 

The key point here is not just that variation exists, and so there is uncertainty in specifying these 
community properties, but that community composition varies over time, and so the community 
parameters must also be expected to vary over time. In models like those given in Section 3.1, we 



do not attempt to explain or predict these changes in community composition (and consequently in 
community properties) mechanistically, but we can account for them by treating them as stochastic 
processes. Now, we expect some level of persistence in community composition, so it does not 
seem realistic to treat community properties as being drawn independently from some underlying 
distribution at each time step. Instead, we allow for community persistence by treating community 
properties as the outcome of a first-order autoregressive stochastic process. 

This means that if b is a generic biogeochemical parameter in the deterministic model, we replace 
b by a stochastic BGC process B in the model, with 



B(t + At) = B(t) ■ (1 — At/r) + Cs(i) ■ Ai/r, for |1 - At/r\ < 1. (20) 

Here, At is the discrete time step (assumed to be 1 day in our example), r is the characteristic time 
of the autoregressive process (that is, the time scale on which community composition changes), 
and {Cb(0} represents a sequence of independent and identically distributed random variables 
with distribution [Cb]- Detailed properties of this process required for our study are provided in 
Appendix [Bj 

We can obtain prior information on the distribution [Cb] by considering past laboratory and 
field studies. In fact, meta-analyses of past studies for many ecophysiological parameters have been 
conducted by researchers looking to establish systematic relationships between these parameters 
and individual size. These analyses show that parameters typically vary over orders of magnitude, 
so there is good reason to propose log-normal distributions for [Cb] (i.e., normal distributions for 
log(Cfl(t))), for most parameters. 

There are some further complications we need to consider in making the step from a meta- 
analysis of laboratory studies to specifying a prior for distributions like [Cb]- The meta-analyses 
summarize results of measurements on individual species drawn from a wide variety of locations, 
but the processes B refer to means over the community of species present at a particular loca- 
tion. We would expect the variance of the community mean to be less than the variance over the 
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constituent species; this effect is dealt with explicitly in Appendix [Bj It is also possible that the 
species comprising a functional group at a particular location will be less diverse, and may exhibit 
lower variance, than the species represented in meta-analyses. We denote the ratio of the coefficient 
of variation (CV) of community mean parameters to the CV of species parameters by PDF for 
phytoplankton, and ZDF for zooplankton. In Appendix |B| we relate these ratios to measures of 
community diversity. 

Because of the lognormal nature of the autoregressive error Cs(^) i n (|20[), w e consider the mean 
of B, E(B), and the coefficient of variation of B, namely CV(B) = y/Var(B)/ E(B). Appendix |b| 
shows how it is possible to choose the mean and variance of logCe(i) such that E(B) and CV{B) 
are consistent with the mean and variance of individual species properties, given the values of PDF 
and ZDF. We treat PDF, ZDF and the expected value E(B) = /z B , where B is the set of all 
BGC autoregressive processes, as parameters in Ow We also assume characteristic time scales for 
changes in phytoplankton community composition (rp), and likewise for zooplankton community 
composition {jz)- 

We need to establish priors for the parameters controlling the behaviour of the autoregressive 
processes: PDF, ZDF and /x B . We set broad, relatively uninformative priors for PDF and ZDF. 
We also set relatively uninformative priors for the components of /x B , by assigning them the same 
distribution (mean and variance) used to describe the individual species parameters, based on the 
meta-data (Appendix [B]). This means that the prior distribution allows the community parameter 
to take on the most extreme values revealed by individual species. For further information on priors 
and their derivations, see Appendix |A.4 and Section 3.2 



We can now translate the stochastic BGC process model into the BHM formalism presented in 



Section 2. The process W, as defined in Section 2.1 can be split into the state vector X and a 



vector B that recall is the set of autoregressive BGC processes. That is, 

[W] = [X,B], (21) 

where the state is X = {N, P, Z, D} and the (random) BGC processes are B = {g max , \ max , i? n , a^v, 
Iz, Clz, Ez, Td, tuq}. Similarly, 6\y in ^ can be split into two parameter sets, those appearing 
explicitly in the equations updating X, namely Ox = {K\Vi a Ch-> s d > J 'd] > and those appearing in 
the autoregressive equations for the BGC processes B, namely Ob = {PDF, ZDF, fj, g max, [i\max, 
/^Rjvi /V/v> fJ'iz, l^cizi ^E z , fJ-r D , ^m Q } , taking note that PDF and ZDF effectively scale the 
coefficient of variation, CV(B), given in [I] of Appendix A. 4 . The state-space representation is now 
as given in Figure [2] below. 

[Xo] -> [Xi] -> ... [X t _i] -> [X t ] -> ... [X T ] 

[B ] -> [Bi] -> ... [Bt-i] -> [B t ] ^ ... [B T ] 

Figure 2: Evolution of the state (X) and the BGC (B) processes. Recall that W = [X,B] and that 
Figure 1 shows how observations Y are related to the process W. 

In terms of conditional probabilities, the formulation developed in this section means that: 

[W t |W t _i,0 w ] = [Xt.BtlXt-i.Bt-i,^] = [XilX^B^flxKBtlB^flp], (22) 

where the last equality expresses the fundamental evolution of the process model (Section |3.1.2 ). 



11 



In press at Ecological Applications 



3.2 The parameter (prior) model 

The priors assigned to the parameters specified in this study were drawn from a meta-analysis 
of the literature. A summary of the prior information available for the BGC parameters and 



processes, and the sources of this information, is given in Appendix A. 4 Each component of the 



prior is assumed independent of the other components, and no attempt has been made to introduce 
any dependence structure between the parameters. 



3.3 The data model 

The data model explicitly links the process model with the observations. The parameters 6y in 
([2]) control the observation process, and we consider two broad classes of observation error. 

1. Analytical measurement errors should reflect the precision of in situ instruments or laboratory 
analyses. For example, laboratory determinations of chlorophyll-a pigment concentration 
might be expected to have a precision of a few percent. 

2. Representation errors can arise from (i) mismatches in scale (we may model a large volume 
of ocean, many kilometres across, but make measurements on bottle samples comprising a 
few litres), and (ii) mismatches in type (we may predict zooplankton concentration in the 
currency of nitrogen, but measure volume or wet weight of biomass) . 

In most real-world situations, errors associated with mismatches in scale and type outweigh 
analytical measurement errors. The use of a simple 1-box mixed layer model here introduces 
an additional ambiguity. We are neglecting horizontal advection, which might be thought of as 
an additional process-model error. The significance of horizontal advection compared with local 
processes depends on the area of ocean represented by the box. If we regard the box as representing 
an ocean area several hundred kilometres in extent, we might hope that the errors involved in 
neglecting advection are small. But we must then expand the observation error to account for the 
spatial variability observed on these length scales. 

In Section |4j the data model for our application to data from Ocean Station Papa is given by: 

T 

[Y|W,0y]= JJ[Y t |X*,e y ]. (23) 
t=l 

Treatment of 6y for our case study is discussed in section [4} Recall that W is made up of X 
and B; note that if we had direct observations of the ecophysiological properties represented in B, 
these could be incorporated into the data model. 



4 Learning and Predictability Given Observations 

We demonstrate the application of the BHM framework to a marine BGC model using the 
historical Ocean Station Papa (OSP) dataset as a case study. This site was chosen over alterna- 
tive subtropical time series sites because the simple mixed layer model is believed to be a better 
approximation at OSP. Two experiments were conducted: 

1. A twin experiment was run using climatological forcing at OSP, with synthetic observations 
of all state variables assimilated daily. The synthetic observations were generated by adding 
noise to a known "true" trajectory through the state space. 
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140°W ~ 130°W 



Figure 3: A map of the north-east Pacific Ocean displaying the location of Ocean Station Papa 
(Stn. P) with range circles at 100, 500 and 1000 km. 

2. A subset of the historic OSP dataset comprising observations of chlorophyll-a (Chla) and 
nitrate (N) was assimilated for the period January 1971 - November 1974. This corresponds 
to part of a sustained observing campaign, and we found that the marginal posteriors for 
parameters did not change greatly if additional years were included. 

4.1 Ocean Station Papa Site Description 

Ocean Station Papa (OSP) is located at 50°N, 145°W (Figure [3]), in 1500 m of water in the sub- 
arctic region of the north east Pacific Ocean. It experiences a strong seasonal cycle in temperature, 



wind stress, and incident solar radiation (Whitney and Free-land, 1999). During winter and spring 



a mixed layer of depth 80 - 120 m is sustained by a high wind stress with the low incident solar 
radiation unable to induce any persistent stratification of the water column. During summer, the 
thermocline shallows in response to increased surface heating and a reduction in the wind stress. 
Consequently, a relatively shallow mixed layer is maintained of typical depth 25 - 40 m. 

It has been noted that there are persistently high macro-nutrient concentrations in the mixed 
layer and the phytoplankton biomass is typically low. This phenomenon is observed throughout 
much of the open sub-arctic Pacific ocean. While the concentration of dissolved inorganic nitrogen 
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(DIN) is lower in summer than in winter, it is rarely if ever depleted to levels that may cause 



nutrient limitation in primary producers (Harrison 



chlorophyll-a. Previous modelling studies of Matear 



2002 ) . There is no discernible seasonal cycle in 



(1995), Denman and Pena (1999) and Denman 



(2003) discuss the likely controls on phytoplankton biomass and the seasonal variation in primary 



productivity and zooplankton biomass. 



4.2 Learning from Observations: Twin Experiment with Climatological Forcing 

Twin experiments in a setting like that of OSP have been conducted to compare samples from 
the posterior, [W, #|Y], produced by Bayesian inference, with known "true" values of the state and 
parameters. The term "twin" , borrowed from the data-assimilation literature, refers to experiments 
where the model used for inference, and the model from which synthetic observations are generated, 
are the same. Model forcing and boundary conditions are taken from Matear (1995) and are 
climatological in nature; details are given in Appendix |Dj 



4.2.1 Twin Experiment: Design 

To generate the synthetic observations, we select a parameter set 6* (the "true" parameters) and 
take a single realisation of the stochastic model {W^ : t = 0,1, ... ,T} to produce the trajectory 
{Xjf : t = 0, 1, . . . ,T} through state space (again referred to as the truth). We have chosen a set of 
"true" parameters in the twin experiment that are shifted away from the prior means (to provide a 
clearer test of the inference procedure), but that nevertheless yield state- variable trajectories quali- 
tatively consistent with OSP observations (e.g., high- nutrient low-chlorophyll (HNLC) conditions). 
The (synthetic) observations Y are generated by: 



Y i = X*exp(6); t = 0,l,...,T, 



(24) 



where £j are independent and identically distributed (IID) as the normal distribution N(0, a^ bs ). 
The standard deviation, o~ bs, was 0.1 for DIN observations and 0.2 for observations of the re- 
maining state variables. The log-normal error model was adopted because errors in the estimates 
of plankton density are typically better represented by log-normal multiplicative error than by 



additive normal error (Campbell, 1995), and the log-normal multiplicative-error model delivers 



synthetic observations that are non-negative. The observation errors are assumed to be indepen- 
dent over time, reflecting either analytical error or (more likely) uncorrelated small-scale variation 
in concentrations. 



4.2.2 Twin Experiment: Results 

We first generate an ensemble of model trajectories by sampling from the prior distribution for 
parameters and running the stochastic model forward through the period January 1971 - November 
1974, without assimilating any observations. This so-called free-run process-model ensemble is pre- 
cisely a sample from the prior distribution over the state (Figure [4| blue shading), which expresses 
the uncertainty in the state based only on the prior knowledge of the parameters gained from a 
meta-analysis of the literature. In spite of the large prior uncertainty in some of the process-model 
parameters, the median values of the (marginal) prior distributions over state variables show sur- 
prisingly similar qualitative behavior to the observed climatology at OSP (Figure |4j dark blue line). 
The median DIN values remain elevated, and median chlorophyll-a values remain low. However, the 
95% contours of the prior ensemble include unrealistic behaviours not observed at OSP, involving 
near-complete depletion of DIN and intense phytoplankton blooms. 
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When the synthetic observations described in Section 4.2.1 are assimilated, using the method- 
ology described in Section [3j the 95% credibility intervals for the posterior distribution of the state 
are very tightly constrained about the true trajectory (Figure |4j red shading), compared with 
the prior intervals and with the observations. Despite the 20% observation error, the dynamical 
BHM implemented through the PMCMC described in Appendix [Cj accurately tracks the true state 
(Figure |4j green line) . 

The case for N deserves additional explanation. The seasonally varying N concentration, pre- 
scribed below the mixed layer as a boundary condition, imposes a sharp upper limit to the predicted 
mixed- layer N concentrations. Provided grazing control keeps phytoplankton biomass and N utiliza- 
tion small, the predicted concentration is very close to this upper bound. In most prior trajectories, 
grazing control is effective, so the prior median is close to the upper limit. Some prior parameter 
combinations allow phytoplankton blooms and N depletion, resulting in the drawdown of N to 
near-zero levels seen in the prior lower 95th percentile for N. The truth is chosen to be OSP-like, 
and so produces N concentrations close to the upper bound. Since we add noise to the truth, a 
significant fraction of the observations lie above the upper bound. 



The prior distributions over the parameters given in Table [T] of Appendix A. 4 are the blue curves 
in Figure [5j These priors are discussed in Section 3.2 and are considered "global" in that they 
represent experimental results encompassing a wide range of species and domains. For some model 
parameters (acTu S D, PDF, ZDF, ^ gma x, fj,^ ma x, \ici z i He z , ^r D , and fJ>m Q ), the marginal posteriors 
in Figure [5] show evidence of learning in that the posterior mode has moved towards the truth and 
the posterior variances have contracted compared with the prior. However, for others, the inference 
procedure appears to extract little or no information from the data, and the marginal posteriors 
appear to merely recover the prior distributions. This is true for the parameters controlling light 
attenuation due to water (K\y), the fraction of zooplankton waste diverted to detritus (/d), the 
parameters related to nitrogen uptake and nitrogenxarbon ratios (a^ and Rn), and the maximum 
zooplankton ingestion rate (Jz)- In the case of ajv, the posterior variance is slightly reduced, but 
the posterior median remains centred at the prior mean. 



The inference procedure generates posterior distributions for time series of the autoregressive 
processes B(t), and could provide information about changes over time in the ecophysiological prop- 
erties they represent. However, the results from this twin experiment are only mildly encouraging 
in this regard. In cases where the observations are uninformative about the parameters underlying 
the autoregressive processes, one can hardly expect to obtain information about the temporal vari- 
ation in the processes themselves. Indeed, in those cases, the posteriors for the stochastic-process 
trajectories are the same as the priors. In two cases (g max and Clz), the posterior median trajecto- 
ries appear to track the truth, although with consistent bias in the case of Clz (Figure [6]). But for 
these, and all other autoregressive processes, the 95% credibility interval for the posterior exceeds 
the amplitude of the temporal variation in the truth by some margin. The inference procedure 
does not allow us to conclude that there are significant changes in these processes over time. 

These results reflect the particular nature of the climatological forcing and system behavior at 
OSP. Given that concentrations of dissolved inorganic nitrogen at OSP remain well above levels 
expected to limit phytoplankton growth, it is unsurprising that parameters controlling nitrogen 
limitation of growth rates are poorly constrained. Similarly, phytoplankton biomass remains at 
levels well below those required to saturate zooplankton grazing, and zooplankton growth rates are 
controlled by the clearance rate Clz, not by the maximum ingestion rate. 
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Figure 4: Twin experiment: a time series of the prior and posterior distributional properties of the 
state (X). Note that the posterior credibility intervals remain so close to the posterior median that 
they are difficult to distinguish. The figure legend is shown on the lower right. 
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Figure 6: Twin experiment: a time series of the prior and posterior distributional properties of the 
autoregressive processes (B(i)). 
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4.3 Learning from Observations: Ocean Station Papa Dataset 

To demonstrate the application of the BHM approach to a real dataset, we have used a subsample 
of historical OSP data. 

4.3.1 Ocean Station Papa - Data Model 

Observations of nitrate (DIN) and chlorophyll-a taken between January 1971 and November 
1974 are used. Observation errors are large and dominated by spatial sampling errors, because we 
neglect horizontal advection and assume a large model domain with high levels of within-domain 
variability. The presence of larger observation errors means that the data will be less informative. 
We draw on a number of studies below for estimates of the appropriate levels of spatial variability. 

The spatial and temporal variability of Particulate Organic Carbon (POC) in this region has 



been investigated at a number of scales (Bishop et al. , 1999). The spatial variability in the vertical 



and horizontal directions was calculated from the beam-attenuation coefficient obtained from a 
transmissometer. 

• Small-scale horizontal variability (l-10km) of POC appears to be 5% to 10%, which is deemed 
negligible in comparison to the ocean scale and temporal variability. 

• Large-scale horizontal variability (100-300km) of POC appears to range from 10% to 40%, 
however we attribute some of this variability to the passage of weather systems on time scales 
of 5 to 10 days. 

• Ocean-basin-scale variability (800-2000km) exceeds both the large-scale and small-scale spa- 
tial variability, but this is due to the change from HNLC conditions in the deep ocean to a 
more typical temperate seasonal cycle on the continental shelf. 



Bishop et al. (1999) also noted significant interannual variability that may be linked to El Nino 



events. Nitrate data collected along the Line P transect (a 1425 km long transect between the 



coast adjacent to the Juan de Fuca strait and Ocean Station Papa (e.g. Pena and Bograd, 2007)) 
from 1992 to 1997 display a similar pattern to the POC data. Again, it appears that on the scale 
of 100-300km around OSP, variability in total concentrations of nitrates and nitrites appears to 



be 10% to 30%, with inter-annual variability exceeding the large-scale spatial variability (Whitney 



and Freeland 1999) 



Taking all these sources of information into account, we have assigned a CV (o" bs) of 0.5 to 
the observation error for both DIN and chlorophyll-a. This is a conservative (upper) estimate, 
representing an upper bound to spatial variation, and allowing for other non-spatial contributions, 
including analytical measurement error. 



4.3.2 Ocean Station Papa - Results from Hindcast 

A prior ensemble over the state was constructed in a similar manner to the twin experiment, 
using real forcing from January 1971 to November 1974. Model parameters were sampled from 
the prior distributions described earlier. As in the twin experiment in Section |4.2[ a wide range 
of model behaviors was observed (Figures 0and[8|, ranging from near-complete depletion of DIN 
during summer, to year-round grazing control. As in the twin experiment, the median of the prior 
over the state based on 1971-1974 forcing qualitatively agreed with observed OSP behavior, in that 
DIN was never limiting, and there were no strong phytoplankton blooms as zooplankton grazing 



maintained relatively constant phytoplankton biomass (Matear, 1995 Denman and Pena, 1999 



Denman, 2003). 
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When observations of chlorophyll-a and DIN are assimilated, the 95% credibility interval is 
dramatically reduced. Due to the relatively large observation error prescribed (see Section 3.3), 
the transient, low-magnitude increase in chlorophyll-a seen in the summer of 1972 is absorbed 
into the observation error and not tracked in the state. While the three individual observations 
of this anomalous bloom do not fall within the posterior 95% credibility interval, this cannot be 
interpreted immediately as lack of model fit. This is because the credibility interval depicted is over 
the latent chlorophyll-a state variable, not over the "noisier" observed chlorophyll-a; this distinction 
is important and is discussed by Cressie and Wikle (2011, Section 2.2.2). Although short-lived 
transient features are not tracked by the model, slow seasonal and intra-seasonal variations are well 
captured. The methods described in Section 2.3 not only condition the state on observations from 
previous times, as do filtering approaches, but also on future times. This is referred to as smoothing 
in the Bayesian filtering literature (Briers et al. 2010 Fearnhead et al. 2010). The advantage of 
such smoothing is evident in time periods where there are very few observations (e.g., mid-1973). 

Through the process model, Bayesian methods allow inference on the unobserved state variables 
P, Z and D; see Figure [8j Notice that there is a substantial reduction in the uncertainty expressed 
through the posterior compared with that expressed through the prior, even for unobserved state 
variables. For example, there is a strong seasonal cycle in the zooplankton biomass, which has been 
observed in a number of studies (Harrison, 2002). The peak in the zooplankton biomass occurs 
during mid summer, which coincides with a peak in primary production (not shown). 

The marginal posteriors for model parameters shown in Figure [9] demonstrate that the sparse 
and limited OSP observations carry very little information about many of the parameters. This 
was not unexpected; previous studies have also experienced difficulty in using the OSP dataset 
to estimate parameters in deterministic models (Matear, 1995). The large observation variances 
used here, which compensate for effects of advection, reduce the effective information content of 
the data, but we believe this is realistic, given the model structure. The posterior marginals show 
evidence of learning for four parameters: ZDF, g max , fici z and /i lz . 

Perhaps unsurprisingly, given the high noise levels and sparse observations, the OSP data do not 
allow us to derive useful information about temporal variation in the autoregressive processes B(t). 
Even for those parameters, g max and Cl z , where the observations appear to inform the posteriors 
for the underlying parameters, the posteriors for the autoregressive trajectories show no significant 
variation over time (not shown). 

One advantage of the BHM framework is that we can use the sample generated from the joint 
posterior of the state and parameters, conditioned on past observations, to assess the uncertainty in 
model forecasts and scenarios. In this case, we have used the posterior conditional on observations 
from January 1971 to November 1974 to make a probabilistic forecast for 1975. We do this simply 
by propagating all posterior trajectories forward the additional year, using the boundary and forcing 
fields for that year. The results from this forecast ensemble (median and 95% credibility intervals) 
are shown in Figure 10 Agreement with the (non-assimilated) observations in the forecast period 
is very good. 



5 Discussion and Conclusions 

A key consideration in building BHMs is the treatment of model error. In our study we used 
the fact that the aggregation of communities of species into single trophic levels or functional 
groups, and the replacement of well defined eco-physiological parameters for individual species by 
community- average parameters, is an important source of model error. Consequently, we have 
replaced the constant community parameters used in most biogeochemical models by stochastic 
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Figure 8: A time series of prior and posterior distributional properties of unobserved state variables 
comparing prior (blue) and posterior (red). 
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Figure 10: A forecast (magenta shading) for the model's state variables for the period December 
1974 - December 1975. The posterior and prior of the state are given by the red and blue shading, 
respectively. Observations are denoted by the open black circles; during the forecast period, the 
observations were not assimilated. 
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autoregressive processes that vary slowly in time. This is in contrast to the common approach of 
simply adding white noise to the rate equations. 

A potential drawback of this approach is that it increases the complexity and dimensionality of 
the model and the inference problem. We have augmented the four-dimensional primary state space 
(N,P,Z,D) with nine additional state variables (B). Instead of estimating nine constant param- 
eters, we must estimate nine means and nine variances controlling the evolution of the stochastic 
processes B(t). We have mitigated this problem by using prior information to set the relative 
magnitudes of the variances of phytoplankton and zooplankton community parameters, and using 
stochastic factors related to community diversity to set the absolute magnitude. One advantage 
of the process model, as formulated, is that it allows a strong and direct connection to literature 
meta-data on the distributions of eco-physiological parameters across species. This allows us to 
set informative objective priors for most of the parameters, exploiting a key advantage of Bayesian 
approaches, and partially counterbalancing the increase in unknowns. 

The inference procedure was designed to derive joint posteriors for system parameters and the 
(augmented) system state. Many examples of data assimilation in dynamical models concentrate on 
either state estimation or parameter estimation. Joint inference is particularly difficult in nonlinear 
models with sparse data, and it has typically required strong simplifying approximations, such as the 
replacement of nonlinear dynamics by approximating linear models. The underlying deterministic 
NPZD model is highly nonlinear, displaying two qualitatively different modes of behaviour or local 
stability domains, and the observed behaviour at OSP correspond to only one of these domains. 
The new particle MCMC techniques employed here are able to cope with this nonlinear, threshold 
behaviour, but are computationally expensive. 

Given these challenges, the results of the OSP case study offer a number of grounds for encour- 
agement. First, the stochastic process model allows the construction of priors over the model state, 
by drawing random samples from the prior distribution for model parameters and initial conditions, 
and running ensembles of model simulations. We can think of this prior ensemble as encapsulating 
our ability to predict system behaviour at OSP, given independent scientific knowledge about BGC 
processes, and local environmental forcing, but no other local knowledge. Encouragingly, the state 
median in these prior distributions bears a strong qualitative and even quantitative resemblance 
to OSP observations (Figures |4j [7|) , even though the priors were chosen to reflect the full range of 
species attributes reported in the literature. But the 95% credibility intervals for the prior distri- 
bution also include trajectories involving phytoplankton blooms and nitrate depletion, which are 
incompatible with observations at OSP. 

Data assimilation into dynamical process models can serve a variety of different diagnostic and 



prognostic purposes (see e.g., Gregg, 2008, Luo et al. 2011). One class of diagnostic applications 



targets the hindcasting or nowcasting of system state, given limited observations. Despite sparse 
observations with large sampling errors on one state variable (iV) and one diagnostic variable 
(Chla), the Bayesian inference procedure recovers quite tight posteriors for these observed variables 
(Figure [7]). The Bayesian inference procedure is also able to transfer information from observed to 
unobserved state variables, reducing the uncertainty in the unobserved state variables (P, Z and 
D) by about half (Figure |8j). 

A second class of diagnostic applications focuses on learning about, and interpretation of, model 
parameters. Here, the parameters describe the ecological characteristics of the plankton communi- 
ties present at OSP. To the extent that these parameters have smaller variances a posteriori, we can 
conclude that the observations have provided information about the parameters and the communi- 
ties they represent. The results for OSP are informative and cautionary. In the twin experiment, 
with observations on all state variables, the posteriors for some parameters are essentially identical 
to the priors, so provide no additional information (Figure [5]). These results can be explained in 
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terms of model dynamics. Since nutrients under OSP conditions are always saturating, and phyto- 
plankton concentrations remain low, the parameters affecting phytoplankton growth at low nutrient 
concentrations, and zooplankton ingestion at high phytoplankton concentrations, have negligible 
effect on model predictions, and are not identifiable. This pattern of identifiability is an intrinsic 
characteristic of the environmental forcing and dynamics at OSP. In other ocean conditions, such 
as oligotrophic mixed layers where nutrient concentrations are always low and limiting, one would 
expect different sets of model parameters to be identifiable. 

Using the limited set of historical observations available for OSP, the inference procedure is 
able to extract information about a few key parameters only (Figure [9]). For the most part, 
these parameters directly control the key processes involved in zooplankton grazing control of 
phytoplankton. The posterior distributions for the parameters controlling the variance in model 
parameters (PDF and ZDF) are shifted towards lower values, compared with the priors. At the 
inferred lower levels of stochastic noise, trajectories are less likely to escape the local stability 
domain corresponding to grazing control. 

The Bayesian inference procedure provides posterior distributions for the trajectories of the 
stochastic BGC processes, B(i). Reliable information on changes in these processes would be of 
particular interest to plankton ecologists. However, even in a twin experiment with daily data 
on all state variables, we were only able to obtain suggestive (but not confirmatory) information 
about temporal variation in two parameters. This limited success is understandable, given that we 
are effectively trying to extract information about changes in unobserved variables on relatively 
short time scales, when the evidence of these changes is available only indirectly through changes 
in the time derivatives of the observed state variables. Even modest levels of observation noise 
are sufficient to confound this attempt. We conclude that higher frequency observations, and /or 
lower observation noise, would be required to learn about temporal variation in these community 
properties from observations of state variables alone. 

A twin experiment with similar forcing, sampling pattern and observation noise to the histori- 
cal OSP observations yielded qualitatively equivalent results to those obtained using the real data. 
While not conclusive, this does suggest that the limited information about state and parameters ob- 
tained using the historical observations may be attributed to their sparseness and high observation 
noise, rather than an inconsistency with structural model assumptions. 

It is common practice to distinguish short-term forecasts, in which uncertainty is dominated by 
the error in estimates of the current system state, from longer-term forecasts or projections, in which 
uncertainty may be dominated by errors in model structure, errors in parameter estimates, and the 
underlying stochastic process error. The methods used here allow us to move seamlessly from short- 
term to long-term forecasts. The forecast results are encouraging (Figure 10), especially given that 
the inference procedure and observations have provided information about a small subset only of 
model parameters. This limited information, combined with prior information on other parameters, 
is sufficient to produce a long-term forecast that agrees both qualitatively and quantitatively with 
observations. 

Given the limited identifiability of both parameters and the related stochastic biogeochemical 
processes, one could reasonably ask whether the model is over-parameterised. This would be the 
case if we were building a model specifically for the purpose of explanation or prediction at OSP 
that ignored prior information on model structure and parameters. However, we are engaged in 
developing a generic model, based on well accepted principles and strong prior information. The 
model is applied at OSP, but we envisage the same model (or a similar model) being applied at many 
other locations, and in the long run used as a basis for basin-scale or global BGC models spanning 
many different environmental conditions. Under these circumstances, it would be inappropriate to 
eliminate processes from the model on the grounds that they are not important at OSP, or that 
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they are not identifiable from a particular set of historical observations from OSP. We are interested 
rather in the question of what such a model allows us to infer and predict about OSP and other 
regions, given generic objective prior information and the limited available observations. 

Models with many, poorly identified parameters can be subject to over-tuning and poor predic- 
tive performance, especially if parameter estimation procedures are heuristic, and/or are designed 
to produce a single "optimal" parameter set. The BHM framework and inference procedures used 
here provide protection against over-tuning. The posterior distribution yields samples from the full 
range of possible parameters and states, conditional on priors and observations, and it therefore 



provides a realistic picture of the effects of equifinality (Von Bertalanffy, 1969 Beven and Binley 



1992) on model hindcasts and predictions. The performance of the posterior for the long-term 



forecast for OSP (Figur e[To|) supports this conclusion. 

Emerging observing systems promise much richer data sets than in the past. New automated in 
situ and remote sensors can provide data for more variables with much higher temporal resolution 
and/or spatial coverage. The twin experiment with daily observations presented here provides a 
hint of what we might expect from such improved observing systems. Data assimilating models are 
increasingly being used to assess the information value of alternative observing system designs, as 



part of so-called Observing System Simulation Experiments (OSSEs; e.g., Masutani et al. 2010) 



The twin experiments presented were intended primarily as a check on the consistency and per- 
formance of the inference methods; an OSSE would require careful attention to observing system 
elements and costs, and the use of replicate experiments. We anticipate using the BHM framework 
to build OSSEs. Oceanographic field studies often include local in situ or ship-board experiments 
that effectively measure the instantaneous values of community ecophysiological properties. The 
model formulation proposed here offers the opportunity to integrate these measurements with stan- 
dard observations of state variables (biomass) within a consistent and rigorous inference framework. 
We see this as an interesting direction for further research using both OSSEs and real observations. 

We recognise that, in order to fully exploit the potential for OSSEs, and for hindcasting and 
forecasting more generally, it will be necessary to extend our approach from the 0-D box model con- 
sidered here to spatially resolved models, including both 1-D vertical mixing models (e.g. Mattern 



et al. , 2010) and 3-D circulation models (cf. Gregg, 2008). The adoption of spatially resolved models 



would avoid the ambiguity about spatial scales inherent in the box model and allow a more rigorous 
treatment of spatial sampling errors. We do not foresee major conceptual problems in extending 
the formulation to spatially resolved models, but Bayesian inference in these models will involve 
formidable computational challenges, and may require the development of effective approximate 
techniques. 

We believe that the example presented here delivers at least in part on the promise described by 



Berliner (2003) and Cressie et al. (2009) of BHM as a self-consistent probabilistic framework that 



integrates statistical and mechanistic process models. The specific process model developed here 
shows promise as a basis for applications for many local and regional aquatic BGC applications. 
We hope that some of the methods developed here, including the use of stochastic processes for 
aggregate community properties, will find broader application. 



Acknowledgements 

We are indebted to two anonymous referees for their careful and constructive reviews of an earlier 
version of this paper, which led to significant improvements both in clarity and presentation. The 
guidance provided by Editor-in-Chief David Schimel is gratefully acknowledged. We are grateful 
for helpful comments and constructive criticism at various stages of this research from John Taylor, 



27 



In press at Ecological Applications 



Richard Matear, Yong Song, Nugzar Margvelashvili and David Clifford. We are also grateful for 
the input of many colleagues in helping us formulate our thinking, and in particular we thank 
Karen Wild- Allen for her time and patience in helping the more statistically-inclined members of 
the project team understand more of BGC modelling and its applications. We have benefited from 
Peter Oke's many insights from a data assimilation perspective. We thank Arnaud Doucet for his 
insights on sequential Monte Carlo, which led to a much-improved inference algorithm. Bronwyn 
Harch was instrumental in starting the project leading to the work reported here, and we thank 
her for her enthusiasm and support. Finally, the third author is particularly grateful to Mavis Dias 
for her help, enthusiastically given, at a number stages in the preparation of this paper. 

A Process Model: Supplementary Material 

A.l A simple adaptive model of phytoplankton growth and composition in 
response to light, nutrient, and temperature. 

The phytoplankton growth model used in this paper predicts changes in phytoplankton specific 
growth rate g and composition (nitrogenxarbon chlorophyll-axarbon ratios) in response to changes 
in incident irradiance E, temperature T, and dissolved inorganic nitrogen N. The formulation 
represents a compromise between realism and complexity. Consistent with the BHM framework, 
we have sought a formulation that explicitly connects the key observed quantity chlorophyll-a 
{Chla) to the state variable of phytoplankton biomass (P) using the common currency of nitrogen. 
By explicitly treating changes in nitrogenxarbon ratios as well, the formulation would also support 
observations of dissolved oxygen or dissolved inorganic carbon, although we do not treat those in 
this paper. At the same time, we have avoided introducing additional hidden state variables, and 
we have sought to minimize the number of new parameters. 



This formulation draws on the adaptive phytoplankton growth model of Geider et al. (1998), 
who specified the carbon-specific phytoplankton growth rate, gc, and the nitrogen-specific phyto- 
plankton growth rate, gpf, as follows: 

gc = gZ.O-e X p(-a.\.E/g%)), (25) 
9d = 9 max -Tc.( X - x mm )/(x max - x min ), (26) 



g N = V m .N/(K + N), 



(27) 



ym _ Y max Tc.(x T 



x)/(x r ' 



X 



(28) 



Here, the temperature correction factor Tc is based on a "Qio factor" (see Eq. (10) in the 
main text), A is the chlorophyll-axarbon ratio, K is the half-saturation constant for phytoplankton 



growth on N, and a is the initial slope of the photosynthesis versus irradiance curve. In Geider et al. 



(1998)'s model, the light-saturated C-specific phytoplankton growth rate, g™, and the nutrient- 
saturated phytoplankton nitrogen-specific growth rate, y m , depend on the nitrogenxarbon ratio, 
X, in a way that ensures that the ratio lies between ^ mm and ^ max . We may think of this formulation 
as an extensive version of a cell quota model. 



Geider et al. (1998) introduced a third expression for the phytoplankton CWa-specific growth 



rate, gchh as a function of gc, gN, A, E, and x- Here, we have avoided introducing new dynamic 
state variables for phytoplankton carbon and phytoplankton chlorophyll-a. Instead, we have sought 
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a solution for gjy, A and Xt as functions of T, N and E, under conditions of balanced growth when 
9c = 9n = 9chia- It is not possible to derive an explicit expression for these solutions in the original 
model. However, it is possible to do so with the following key simplification. 



Behrenfeld et al. (2005) argue that phytoplankton adjust their chlorophyll-axarbon ratio, A, so 



that it is proportional to g™ under balanced growth. We assume that, at steady-state, 

A = \ max .g%/g max , (29) 



and substituting (29) into (25) gives 



g c = g%.{l - exp(-a.\™*.E/g™ x )). (30) 

We define g* E as the maximum C-specific phytoplankton growth rate for given irradiance E and 
temperature T. This is achieved at x = x max and is given by: 

g * E = g max .Tc.(l - exp(-a.\ max .E/g max )). (31) 

We define g* N as the maximum iV-specific phytoplankton growth rate for given nutrient concen- 
tration N and temperature T. This is achieved at x = x min and is given by: 

g * N = V max .Tc.N/(K + N). (32) 

Now, g c = g* E .( X - x min )/(x max ~ X min ), and g N = g* N .( X max - x)/(x max ~ X min )- 

Under balanced growth, the specific growth rate g = gc = 9n, and solving for x gives: 

x = (g* E -x mm + 9N-x max )/{g*E + 9n)- (33) 



Substituting (33) into gc = g%-(x ~ X mm ) I (x max — X mm ), an d appealing to the equality g = gc = 
9n i yields: 

9 = 9*e-9*n/(9e + 9*n) (34) 



and, from, (29) 



X = X max .Tc.g* N /(g E + g* N ). (35) 

These expressions have a simple and logical interpretation. The rates g* E and g* N are the potential 
specific phytoplankton growth rates determined by light and nutrient separately, and we can think of 
them as measures of light and nutrient availability as perceived by the cell. The achieved growth rate 
g is a compromise between the two. If one is much larger than the other, then the achieved growth 
rate is very close to the smaller rate. Note that simple non-adaptive models of phytoplankton 
growth often multiply a light-limitation and a nutrient-limitation term. It is well known that 
this underestimates growth rates. Our approach partly avoids the defect of multiplicative growth 



models. In (33), the nitrogenxarbon ratio approaches x mm when light is much more available 
than nutrient (g E >> g* N ), and it approaches ^ max when g* N » g* E , as one might expect. The 
chlorophyll-axarbon ratio, A, approaches \ max when light is limiting and nutrient is abundant 
(<?tv >> g%) j an d it approaches zero when nutrient is strongly limiting and light is saturating. 

It is possible to treat gN as an iV-specific uptake rate, and for the maximum specific uptake rate 
of nitrogen (y max ) to be substantially larger than the maximum C-specific growth rate (g max ). 
This would allow rapid uptake of nitrogen in a dynamic quota model. However, given that we 
are considering only balanced growth here, we treat g^ as a specific growth rate and assume 
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ymax _ ^max _ ^y e mug £ thgn interpret K as a half-saturation constant for growth rather than 
uptake: 

max T 

if=± -. (36) 

a N 



With this simplification (substitute (|36j) into the N/(K + N) term of (32)), we can write: 



h N = N/((g max .Tc/a N ) + N), (37) 



and from (31) we set 



h E = l- exp(-a.\ max .E/g max ). (38) 
As before, under the assumption of balanced growth {g = gc = 9n), we obtain 

g = Tc.g max .h E .h N /(h E + h N ), (39) 

and 

X = (h E . X min + h N . X max )/(h E + h N ). (40) 
If we set R N = x min /x max , then 

X = X max -(h E .R N + h N )/(h E + h N ), (41) 

and 

A = X max .Tc.h N /(h E + h N ). (42) 
The phytoplankton chlorophyll-a:nitrogen ratio equals A/x, and it is given by: 

Chla : N = (X max .Tc/x max )-h N /(h E .R N + h N ). (43) 

A. 2 The transport operator 

In this section we describe the transport operator, T(c,x, t), used in equation §5§ in the main 
text. The specific form applied to each of the state variables differs based on the characteristics of 
the specific state variable being operated on. 

The change in mixed layer depth (MLD) is given by: 

and we define 

V> + (f) = max{^(t),0}. (45) 



This form of exchange across the mixed-layer has been adopted from Evans and Parslow (1985) 



Therefore the state equations, including the effects of changes in the mixed layer, can be written 



as: 



f -i-p-r-z+f^.wp-n (40, 
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^ = Ez . g r.Z-m.Z + ^ 5 . i BCZ-Z), (47) 

— = {\-E z )-f D -gr-Z + m-Z-r-D-S D - D/MLD + {f T ^ ' ■ {BCD - D), (48) 
at MLJJ 

^- = - g . P + {I - E Z ) ■ {I - f D ) ■ gr ■ Z + r ■ D + K+ ^ t] ■ (BCN - N), (49) 

where k is the background mixing, Sd is the detrital sinking rate (a parameter subject to inference), 
and BCP, BCZ, BCD and BCN are the boundary conditions for P,Z,D and N, respectively. 
With the exception of BCN all other state variable boundary conditions are set to 0. It is assumed 
that as the mixed layer shoals, P, D and N are lost from the mixed layer, whereas Z is assumed to be 
retained in the mixed layer. As the mixed layer deepens, P, Z and D concentrations will be diluted, 
whereas N will be added in an amount proportional to the prescribed boundary condition BCN. 
The variables k, ip, MLD and BCN are considered to be exogenous forcing and are prescribed. 

A.3 The light model 

E = E .(l-exp(-Kz))/Kz, (50) 

where Eq is the mean daily photosynthetically available radiation (PAR) just below the air-sea 
interface, and Kz is given by: 

Kz = (K w + a Ch • Chla) ■ MLD. (51) 



In (51), K\y is attenuation due to the seawater, ach is attenuation due to Chla and MLD is the 



mixed layer depth. 

A. 4 The parameter (prior) model 

Where possible, the priors for the process-model parameters have been based on information in 
the literature. The parameters can be divided into three classes. 

For some physiological parameters, there are existing meta-analyses in the literature that provide 



estimates of parameter means and variances. Previous studies of phytoplankton by Tang (1995) 



and Montagnes et al. (1994), and of zooplankton by Hansen et al. (1997), have derived allometric 
relationships (log-log regressions) for phytoplankton and zooplankton parameters g max , \ max ^ E z , 
Clz and Iz as a function of individual size. From these data sets, we have derived prior means and 
coefficients of variation (Table [TJ, assuming that the phytoplankton community at OSP is domi- 
nated by small cells (mean cell volume 100 /mi 3 ), and grazing is dominated by microzooplankton 
(mean individual volume 10 5 /mi 3 ). We have used normal prior distributions for those parameters 
having small coefficient of variation, and log-normal prior distributions for the rest. 

For some parameters, we can draw on a range of quoted values in the literature that are sufficient 
to provide crude estimates of prior mean and variance. Historical observations of the light attenu- 
ation due to water, Kw, and the specific absorption coefficient for chlorophyll-a, achi were taken 



from Kirk (1994). The maximum quantum yield is assumed constant at 0.1 mol C mol photons , 



or 1200 mg C mol photons . Estimates of the affinity of phytoplankton for dissolved inorganic 
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Table 1: The priors on parameters used in the stochastic NPZD model are all log-normal, with 
mean (//) and standard deviation (a) on the log-scale, except for Sd which is Gaussian. 



Parameter 


Description 


Mean(^i) 


SD (a) 


K w 


Light Attenuation: Water 


0.03 m" 1 


0.2 


a-Ch 


Light Attenuation: Chla 


0.04 m 2 mg Chla" 1 


0.3 


S D 


Detrital Sinking Rate 


5 m d _1 


1.0 


fo 


Fraction of grazing to detritus 


0.5 


0.1 


PDF 


Phytoplankton Diversity Factor 


0.15 


0.4 


ZDF 


Zooplankton Diversity Factor 


0.15 


0.4 




Maximum Carbon Specific Growth Rate 


1.2 d" 1 


0.63 


VRn 


Ratio between ^ mm and ^ max 


0.25 


0.3 




Maximum Chla to Cratio 


0.03 


0.37 


Ha N 


Phytoplankton affinity for A 


0.3 d" 1 mg N- 1 m 3 


1 


Viz 


Maximum Zooplankton ingestion rate 


4.7 d- 1 


0.7 


VCl z 


Maximum Zooplankton clearance rate 


0.2 m 3 mg A" 1 d _1 


1.3 


HE Z 


Zooplankton growth efficiency 


0.32 


0.25 


HmQ 


Zooplankton quadratic mortality rate 


0.01 d" 1 mg A^ 1 m 3 


1 




Detrital remineralization rate 


0.1 d- 1 


0.5 



nitrogen, a at, are based on data in Hein et al. (1995) and on theoretical calculations of the diffusion 
limit to uptake. 

Other parameters can be regarded as semi-empirical, representing ecosystem properties and 
processes that are only crudely represented in the model. We do not model zooplankton respiration 
explicitly, and we assume that approximately half of the ingested nitrogen that does not appear as 
an increase in biomass is released as unassimilated fecal pellets, and half is lost through respiration 



and excretion of dissolved inorganic nitrogen (cf. |Parsons and Takahashi 1973), so Jd is given a 



prior mean of 0.5 and a small CV of 0.1 (Table [TJ. Because zooplankton grazing is assumed to be 
dominated by micro-zooplankton, we have assigned the detrital sinking rate Sd a relatively small 
prior mean of 5 m d , with a large CV of 1.0. 

Detrital organic matter comprises diverse organic compounds that vary widely in their suscepti- 
bility to bacterial attack and remineralization. In the model, very labile organic nitrogen compounds 
such as amino acids, which may be utilized and remineralized on time scales of hours, are treated 
implicitly as part of the fraction (1 — Ez)-(1 — I'd) of ingestion that is released directly as dissolved 
inorganic nitrogen. A mean remineralization rate of 0.1 d _1 is applied to the remaining detritus, 
with a relatively large CV of 0.5. The quadratic mortality rate for zooplankton, rriQ, is an empirical 
ecosystem parameter, representing the density-dependent predation on micro-zooplankton. It has 
been assigned a mean value of 0.01 d _1 (mg A m~ 3 ) , corresponding to a mortality rate of 0.1 
d _1 at typical micro-zooplankton biomass levels of 10 mg A m~ 3 . Because we have little prior 
information to constrain ttiq, we have assigned it a large CV of 1.0. 

The scale factors PDF and ZDF have been assigned a prior mean of 0.2. This is a relatively low 
value, and it corresponds to a diverse community in which the community mean properties show 
substantially less variation than those of individual species (see Appendix [B]). The prior variance 
is set to 0.4, so the prior distribution allows higher values of PDF and ZDF, and also less diverse 
communities. 
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B Autoregressive Process with Log-Normal Innovations 

Suppose that a time series {B(t)} evolves according to an AR(1) process with an intrinsic time- 
scale of r. Then, from equation (20) in the main text, 

B(t + l) = B(t)(l-l/T) + ( B (t)/T. (52) 

The innovation sequence {Cb(*)} is independent of {B(t)} and is assumed to follow a log-normal 
distribution, so that Z = log(£e(i)) ~ N(fi z ,a 2 ). The moments of the log-normal distribution 
are immediately available from those of the normal distribution since E[(,B(t) s ] = E[exp(sZ)], the 
moment generating function of the underlying normal distribution. We find then that 

E(( B (t) s ) = exp(sfi z + s 2 a 2 /2), s > 0. (53) 
Hence the mean and variance of {Cb(£)} follow directly as 



H ( = exp(n z + a 2 /2), 

a\ = exp(2/x z )(exp(2 C j2) -exp(cT 2 )). (54) 
The coefficient of variation, CV((b) = cc/^C' is given by 

CV(Cb) = (exp(^) - l) 1 / 2 , (55) 

and so it does not depend on the mean. 

Assuming second-order stationarity of {B(t)} requires that |1 — l/r| < 1, so r > 1/2. Taking 



expectations on both sides of equation (52), we find (using an obvious notation): 



MB = Mb(1 - 1/t) + Mc/ t j ( 56 ) 

so 

MB = MC = ex P (Mz + ell 2 ) ■ ( 57 ) 

Next, calculating the variance of both sides using the independence assumption above, we obtain 
again using an obvious notation): 



a% = (l-l/r) 2 a 2 B + a 2 /r 2 

= (2r- l)- 1/2 al (58) 

Since fis = Mo 



CV{B) 2 = CV{C b ) 2 /{2t-1) 

= (exp(a 2 )-l)/(2r-l). (59) 

We may apply this result as follows. Suppose a community is made up of n species, each with 
biomass fractions {pi : i = 1, . . . , n; Y^Pi = 1}> an d suppose further that the species-specific growth 
rate parameters <^ are log-normally distributed, where log(cf> i ) has mean and variance a 2 ^. We 
are interested in the biomass- weighted community value, $ = Y^Pi^i- Hence, 
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Further, 



Thus, 



var 



exp(/i^ + <7^/2). 



'*) = 5^Pi 2 var(^) = ^p i 2 exp(2 W(/) )(exp(2a / 2 (/) ) - exp(of )). 



(60) 



(61) 



Cy($) 2 = ^ K 2 (exp(^)-l). (62) 
In our case, we want to choose fj, ( and a ( so that E{B) = E{§) and CV(5) 2 = CV{<&) 2 . Using 



(57), (59), (60) and (62), we have 



Finally, 



(exp(c7 2 ) 



exp{^ +cr 2 /2) 
l)/(2r-l) 



exp(^ + ofj/2), 
5^Pi 2 .(exp(<r^) - 1). 



^ = /J^ + o^/2 - of /2, 



(63) 
(64) 

(65) 
(66) 

For those eco-physiological parameters modelled here as a stochastic autoregressive process 
{B(t)}, we derive prior estimates of fii^ and cr/^ from the literature. We treat the quantities YIpI as 
constants. For the phytoplankton community, ^2p 2 = PDF 2 , and for the zooplankton community 
^2, pf = ZDF 2 . The diversity factors PDF and ZDF (recalling that these scale the prescribed 
values) and fj,^ are parameters subject to inference, and they are given prior distributions (see 



a 2 = log(l + (2r-l)^p i 2 (exp(a 2 /) )-l) 



Appendix A. 4) 



C Particle Markov Chain Monte Carlo Sampling 

For brevity, we use the notation Wj : j (where i < j) to denote the sequence (Wj, . 
likewise for Yj :J -. We may write the posterior distribution as 



. , Wj ) , and 



[W 1:T ,0 w ,0 Y |Y 1:T ] 



\Wi-.t\0\V, &Y, Yi : t] [Ow , #y| Yi : t]. 



(67) 



The first factor of Equation (67) may be computed using a particle filter (Gordon et al. 1993 



Doucet et al. 2001), with parameters fixed according to a draw from the second factor, considered 



below. At time t, let [Wi : t|0\v, #Y, Y13] be represented by a set of N samples (particles) w\. t 
with associated weights a\, for i = l,...,N. Given {(w\. t _ 1 ,a' l t _ 1 )}, we may recursively obtain 
{( w i:t> a t)} W setting, for i = 1, . . . , N, 



w 



a 



o 



Wi|W t _i 

Y f |W t 



t-i 



-li^w] 

wt,e Y }. 



(68) 
(69) 
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The procedure is initialised with w ~ [Wo] and a = 1, for i = 1, . . . , N. After several iterations, 
the sample tends to degenerate to a situation where only a single particle has significant weight. For 
this reason, resampling is typically performed after each step, redrawing particles with probability 
proportional to weights, and resetting all weights to be identically one. This has the effect of 



eliminating particles of low weight and replicating particles of high weight (Gordon et al. 1993 



Kitagawa, 1996) 



Note that, by time T, the procedure has produced weighted samples of the target factor 
[Wi : t|0w> 0Yj Yi : x] of Equation (67). Sampling-importance-resampling (SIR) may be used to 
obtain a single sample from among these, making a single multinomial draw from the set {w^. r } 



with assigned probabilities {a l T / Y2j=i oP T }. 

Turning to the second factor of Equation (|67l) , note that 



oc 



[Yi : t|0Wj 0y][0Wi Qy], 



and that a particle filter may again be used to compute the likelihood term: 



(70) 



[Yi : t|0w, Qy] 



[Yi|0 w , Y ] fl[Y t |Yi :t _i, 6> w , Oy] 

t=2 

/+oo 
[Yi|Wi,e Y ][Wi]<nvi x 
-oo 

T r+oo 

Yl / [Y t \W t ,0Y][W t \Y 1:t - 1 ,e w ,0Y}dW t 

i_n J —OO 



(71) 



t=l i=l 



(72) 
(73) 



where we have assumed in arriving at the last line that resampling is performed at each recursion 



of the particle filter. See Andrieu et al. (2010) for details 



The computation of this likelihood facilitates a Metropolis-Hastings (Metropolis et al. 1953 



Hastings, 1970) sampling of [#w> 0y|Yi : t]- For each parameter sample accepted, a state sample is 
drawn from the same particle filter used to obtain a likelihood estimate, resulting in a full sample 



of the posterior distribution given by Equation (67). 



Finally, we require a suitable proposal distribution for the Metropolis-Hastings sampler. In 
this work, we use the approximate (log-) normal posterior furnished by a joint unscented Kalman 



filter (Julier and Uhlmann 1997; Wan and van der Merwe, 2000) performing online process and 



parameter inference. After some number of steps, this is discarded in favour of a (log-)normal 



proposal adapted from the history of posterior samples (Haario et al. 2001). 



D Case Study: Supplementary Material 



D.l Forcing Data 

The forcing data used for the climatological twin-experiment and case-study was derived from 



that used in the study outlined in Matear (1995). The exogenous forcing that was applied is given 



in Figure 11 The forcing used for the 1971-1975 OSP experiment using the historical observations 



was again derived from the dataset used by Matear (1995) and is contained in Figure 12 The time 
series of Mixed Layer Depth (MLD) is used in equation (|5j) in the main text to form the transport 
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operator and is discussed in Appendix A. 2 Furthermore, the daily change in MLD (44) is used to 



parameterise the exchange across the base of the mixed layer. The sub-Mixed Layer DIN is the 
boundary condition (BCN) referred to in Appendix A. 2 The Mixed Layer Temperature is used in 
the temperature correction term (equation (10) in the main text) for many of the rate processes 
given in Section 3.1 in the main text and Appendix A.l while the photosynthetic available radiation 
(PAR) immediately below the surface is used to drive the photosynthesis in the phytoplankton 
growth model (Appendix A.l). 
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Figure 12: The forcing used in the Ocean Station Papa case study. 
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